ABSTRACT
This paper focuses on two types of empirical methods for estimating
research result replicability: double cross-validation and bootstrap
procedures. A commonly available statistical computer package, SPSS-X,
is used to carry out the double cross-validation procedure, and a
recently developed microcomputer program package (developed by
C. E. Lunneborg, 1987) is implemented to demonstrate the bootstrap
logic. Both methods are applied to a heuristic data set of observed
values of three independent variables and one dependent variable for
a sample of 25 subjects. It is concluded that although each procedure
has some shortcomings, the advantages of using either far outweigh
the disadvantages. There are 5 tables of analysis data.

Abstract

Statistical significance is often inappropriately equated 
with evaluating result importance and evaluating result 
replicability, even though these are three somewhat different 
issues. The prudent researcher must separately assess each of 
these elements of the "research triumvirate" by using different 
methods. This paper focuses on two types of empirical methods 
for estimating research result replicability, double cross- 
validation and bootstrap procedures. A commonly available 
statistical computer package, SPSS-X, is used to carry out the 
steps required for the double cross-validation procedure, and a 
recently developed microcomputer program package (Lunneborg, 
1987) is implemented to demonstrate the bootstrap logic. Both 
methods are applied with a heuristic data set using multiple 
regression analysis so that the discussion is concrete.

Estimating Result Replicability Using Double
Cross-validation and Bootstrap Methods

Many researchers in the social sciences have invested 
unwisely in "statistical significance testing" stock only to find 
that its market value continues to shrink as the limitations of 
significance testing are more widely understood. Carver (1978)
asserted in the Harvard Educational Review that too many
researchers use statistical significance testing to support
"fantasies."

One of these fantasies is to equate evaluating statistically 
significant results with evaluating result importance or result 
replicability. These null hypotheses (e.g., $H_0$: statistical
significance = result importance) must always be rejected by the
careful researcher. It may happen that in a given study the
results prove to be statistically significant, important (at
least by the value judgment of the researcher), and replicable,
but when these three descriptors are used appropriately, they are
assessed using three different methods (Thompson, 1989).

To determine if results are statistically significant, one 
can quickly and mechanically "decide" whether a given null hypothesis
(e.g., $H_0$) should be rejected or not rejected at a specified
alpha level. But results that are not statistically
significant cannot automatically be assumed to be unimportant.
This hasty generalization has produced a plethora of unpublished
studies that may not have been statistically significant, but may
have been useful nonetheless. Moreover, this somewhat arbitrary
discrimination procedure, which is often used by journal editors,
has closed off potential research avenues (Atkinson, Furlong, &
Wampold, 1982; Greenwald, 1975).

When researchers do not "achieve" statistical significance 
with their results, they may find it useful to ask, "At what
larger n would these results be statistically significant?"
since sample size is the primary influence on statistical
significance (Thompson, 1989). Researchers should also consider
effect size in order to further evaluate the importance of their
results. In multiple regression, the squared multiple
correlation coefficient, $R^2$, is the effect size. This indicates
the percent of variance of the dependent variable explained by 
the predictor variables. One possible effect size measure used 
in ANOVA is called eta squared or the correlation ratio. It 
indicates the percent of variance of the dependent variable that 
is explained by a given treatment or group. Many other effect 
size estimates are available in determining result importance. 
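
The two effect sizes just named can be computed directly from raw scores. The following Python sketch is offered only as an illustration; the function names `r_squared` and `eta_squared` are hypothetical, and NumPy is assumed to be available. It is not part of the SPSS-X or Lunneborg (1987) software used in this paper.

```python
import numpy as np

def r_squared(X, y):
    """Squared multiple correlation from an ordinary least-squares fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1.0 - np.sum(residuals ** 2) / ss_total

def eta_squared(groups):
    """Between-groups sum of squares divided by the total sum of squares."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_scores = np.concatenate(groups)
    grand_mean = all_scores.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_total = np.sum((all_scores - grand_mean) ** 2)
    return ss_between / ss_total
```

Both functions return a proportion between 0 and 1; multiplying by 100 gives the percent of dependent variable variance explained.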

Even if results are statistically significant and yield a 
very large effect size, they still may not be important, at least 
to some researchers. Result importance is inherently an 
inescapable personal value judgment. Mathematical calculations 
may help to inform these judgments, but cannot automate the value 
judgment process. Therefore, result importance is "judged" by 
carefully weighing the above mentioned factors and by considering 
the phenomenon being explained. Only then can the researcher
make an informed value judgment as to the overall "significance"
or importance of the results.

Statistical significance testing is easy to carry out. 
Result importance can be rather painlessly determined by looking 
at several factors. But how does one determine result 
replicability, the essential element of the "research 
triumvirate"? The best way of predicting result replicability, 
or stability across samples, is to validate result stability 
empirically by conducting replications on as many independent 
samples as possible and by then comparing the results. Kerlinger
(1986) explained the importance of replication:

If a study is replicated and the same or similar results 
are found, our trust and confidence in the results are 
increased. If the study is again replicated and the same 
results are obtained, our trust and confidence are greatly 
increased because the probability of obtaining the same 
results three times by chance is lower than the probability 
of obtaining the same results twice. (p. 124)

In the social sciences it is often impractical to conduct
numerous replication studies to determine result
generalizability; instead, the stability across samples can be
estimated using one of three types of techniques: the double
cross-validation procedure (Mitchell & Klimoski, 1986; Mosier, 1951;
Pedhazur, 1982; Rowell, 1991; Thorndike, 1978), the jackknife method
(Crask & Perreault, 1977; Tukey, 1958), or bootstrap applications
(Diaconis & Efron, 1983; Lunneborg, 1987; Thompson & Melancon,
1990).

Tukey's (1958) jackknife technique, named after the
versatile and useful Boy Scout's jackknife, involves the
systematic deletion of different observations or subsets of
observations followed by the computation and comparison of
calculated estimators (e.g., β-weight coefficients, discriminant
function coefficients) derived from these revised samples.
Unlike some invariance methods, the jackknife technique allows 
for coefficient stability to be determined using a very small 
sample size (Crask & Perreault, 1977). But this approach tends 
to focus on the influence of outliers on potential result 
replicability. 
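
The delete-and-recompute idea behind the jackknife can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: it deletes one subject at a time, it recomputes only the standardized regression weights, and the names `beta_weights` and `jackknife_betas` are hypothetical. It does not compute Tukey's pseudovalues and is not the procedure of any package cited here.

```python
import numpy as np

def beta_weights(X, y):
    """Standardized regression weights: least squares on z scores."""
    Zx = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    coef, *_ = np.linalg.lstsq(Zx, zy, rcond=None)
    return coef

def jackknife_betas(X, y):
    """Recompute the weights n times, leaving out one subject each time."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    keep = ~np.eye(n, dtype=bool)  # row i is True everywhere except position i
    return np.array([beta_weights(X[keep[i]], y[keep[i]]) for i in range(n)])
```

Each row of the returned array holds the weights obtained with one subject removed, so a row that departs sharply from the others points to an influential case.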

This paper focuses on two other techniques for estimating 
result stability, the double cross-validation and bootstrap 
methods. Cross-validation methods involve randomly dividing the 
original sample into subsets, conducting separate analyses, and 
then empirically comparing the results (Thompson, 1989).
Bootstrap methods conceptually involve creating a "mega" data
file by copying the original data set an enormous number of
times. Random samples are then drawn from the "mega" file,
analyses are conducted on each new sample, and the impacts of
numerous different configurations of subjects are then compared
(Crask & Perreault, 1977).

Double Cross-validation Method 
The name cross-validation is used because this procedure was 
originally devised to determine the validity of scoring keys in 
which different weights were given to the items of a test or an
inventory (Thorndike, 1978). The double cross-validation
procedure (Mosier, 1951) is one of several cross-validation
strategies used for splitting an original sample, called the 
development sample, into two samples, and then comparing various 
results from the samples to determine the likelihood that the 
original results will replicate. 

The double cross-validation procedure requires seven 
distinct steps, each of which can be easily accomplished by using 
a statistical computer package such as SPSS-X. After a 
description of the steps, specific concepts mentioned within the 
task analysis such as "shrinkage" and "invariance coefficients" 
will be discussed, and the advantages and disadvantages of this 
method will be elaborated. Steps in conducting the double cross- 
validation procedure include: 

1. The original sample of data is randomly divided into two 
subsamples (i.e., subsample 1 and subsample 2) with equal or 
unequal sample sizes. It is usually convenient to use nearly 
equal subsamples that are not exactly the same size. 

2. Each of the variables within the two new subsets (e.g., $X_{11}$,
$X_{12}$, ..., $X_{1j}$ for subsample 1, where the first subscript indicates
the subsample number and the second subscript tells the sequence
number of the predictor variable) is converted from raw scores
to z scores (i.e., standard scores with a mean of 0 and a
standard deviation of 1). The conversion is made by using the
mean and standard deviation of subsample 1 to standardize
subsample 1 data and by using the mean and standard deviation of
subsample 2 to standardize subsample 2 data (e.g.,
$Z_{11} = (X_{11} - \bar{X}_{11}) / SD_{X_{11}}$ for the z scores of
subsample 1 (first subscript), predictor variable 1 (second subscript)).

3. Regression analyses are conducted with each subsample's z
score data set, yielding two regression equations: $\hat{Y}_{11}$
(pronounced "Y-hat") for subsample 1 data (first subscript) using
β-weights derived from subsample 1 (second subscript) and $\hat{Y}_{22}$
for subsample 2 data using β-weights derived from subsample 2.
(Note: For X, Z, β, and $\hat{Y}$, the first subscript indicates which
subsample data set is being referenced. For $\hat{Y}$ only, the second
subscript stands for the subsample number from which the
β-weights in that regression equation were derived. For X, Z, and
β, the second subscript tells the sequence number of the
predictor variable.)

$$\hat{Y}_{11} = \beta_{11} Z_{11} + \beta_{12} Z_{12} + \beta_{13} Z_{13} + \cdots + \beta_{1j} Z_{1j}$$

$$\hat{Y}_{22} = \beta_{21} Z_{21} + \beta_{22} Z_{22} + \beta_{23} Z_{23} + \cdots + \beta_{2j} Z_{2j}$$

4. $\hat{Y}_{11}$ values are calculated using z scores from subsample 1 and
$\hat{Y}_{22}$ values are calculated using z scores from subsample 2.
These are the actual regression results for each subsample.

5. The β-weights are then crossed such that z scores from
subsample 1 are used in the β-weight regression equation of
subsample 2 to calculate $\hat{Y}_{12}$ and z scores from subsample 2 are
used in the β-weight regression equation of subsample 1 to
calculate $\hat{Y}_{21}$.

$$\hat{Y}_{12} = \beta_{21} Z_{11} + \beta_{22} Z_{12} + \beta_{23} Z_{13} + \cdots + \beta_{2j} Z_{1j}$$

$$\hat{Y}_{21} = \beta_{11} Z_{21} + \beta_{12} Z_{22} + \beta_{13} Z_{23} + \cdots + \beta_{1j} Z_{2j}$$



6. Invariance can be evaluated by considering the shrinkage for
each group. "Shrinkage" for subsample 1 is calculated by
subtracting the squared multiple correlation coefficient ($R^2_{12}$),
which is the squared bivariate correlation of $\hat{Y}_{12}$ values and the
dependent variable z score values in subsample 1, from the squared
multiple correlation coefficient ($R^2_{11}$), which is the squared
bivariate correlation of $\hat{Y}_{11}$ values and the dependent variable
z score values in subsample 1.

$$\mathrm{SHRINKAGE}_1 = R^2_{11} - R^2_{12}$$

The shrinkage for subsample 2 is similarly calculated.

$$\mathrm{SHRINKAGE}_2 = R^2_{22} - R^2_{21}$$

7. The invariance is also evaluated by calculating two invariance
coefficients. The first is determined by calculating the
bivariate correlation coefficient of $\hat{Y}_{11}$ values and $\hat{Y}_{12}$ values.
The bivariate correlation coefficient of $\hat{Y}_{22}$ values and $\hat{Y}_{21}$
values is the second invariance coefficient.
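
The seven steps above can be sketched in Python as follows. This is an illustration only, not the SPSS-X program of Table 1: the function names are hypothetical, NumPy is assumed, and standard deviations are computed with n - 1 in the denominator, which is an assumption about how the subsample statistics were obtained.

```python
import numpy as np

def random_split(n, rng):
    """Step 1: randomly assign n subjects to two nearly equal subsamples."""
    order = rng.permutation(n)
    half = n // 2
    return order[:half], order[half:]

def zscores(a):
    """Step 2: standardize with the subsample's own mean and SD."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def corr(u, v):
    """Bivariate correlation of two score vectors."""
    return np.corrcoef(u, v)[0, 1]

def double_cross_validation(X1, y1, X2, y2):
    """Steps 2 through 7 for two subsamples that are already split."""
    Z1, Z2 = zscores(X1), zscores(X2)
    zy1, zy2 = zscores(y1), zscores(y2)
    # Beta weights derived within each subsample
    b1, *_ = np.linalg.lstsq(Z1, zy1, rcond=None)
    b2, *_ = np.linalg.lstsq(Z2, zy2, rcond=None)
    # Own-weight and crossed-weight predicted scores
    yhat11, yhat12 = Z1 @ b1, Z1 @ b2
    yhat22, yhat21 = Z2 @ b2, Z2 @ b1
    return {
        "shrinkage_1": corr(zy1, yhat11) ** 2 - corr(zy1, yhat12) ** 2,
        "shrinkage_2": corr(zy2, yhat22) ** 2 - corr(zy2, yhat21) ** 2,
        "invariance_1": corr(yhat11, yhat12),
        "invariance_2": corr(yhat21, yhat22),
    }
```

Calling `random_split` with a different random generator state gives a different split, which makes it easy to see how much the shrinkage and invariance values depend on the particular division of the sample.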

In the task analysis, steps six and seven are crucial for 
evaluating the estimated invariance or stability of the research 
results. One way to investigate the likelihood that results will 
replicate is to measure the shrinkage of the multiple correlation 
coefficient for each subsample. Step six explains the process of 
calculating shrinkage. In the double cross-validation procedure, 
shrinkage of the multiple correlation coefficient occurs when the
β-weights are "crossed" because the β-weights derived from the
original subsample yield the highest possible correlation between
the predictor variables and the dependent variable. Put
differently, Pedhazur (1982) explained:

If one were to apply a set of weights derived from one
sample to the predictor scores of another sample and then
correlate these predicted scores with the observed
criterion scores, the resulting R would almost always be
smaller than the R obtained in the sample for which the
weights were originally calculated. (p. 147)

The reason for shrinkage is that in calculating the weights to
obtain a maximum R, the zero-order correlations are treated as if
they were error-free, which is never the case. Because of this
capitalization on chance, sometimes referred to as "overfitting,"
the original resulting R is biased upwards.

Mitchell and Klimoski (1986) concluded that shrinkage is 
usually reduced when predictors are chosen based on prior theory 
and experience-based knowledge of predictor-criterion 
relationships (i.e., rational procedures requiring forethought) 
rather than by blind empirical selection (i.e., selected with a 
relatively low level of rationality). Therefore, when rational
procedures for selecting predictor variables are used instead of
"data-snooping" or stepwise multiple regression
techniques (Snyder, 1991), it is likely that shrinkage will be
reduced and that invariance or stability will therefore increase,
since there is an inverse relationship between shrinkage and
invariance. In other words, the degree of stability across
subsamples increases as the two shrinkage estimates approach
zero.

However, shrinkage formulas work poorly with small sample
sizes (i.e., fewer than 30 subjects per independent variable),
unfavorable ratios of sample size to predictor variables (i.e.,
less than 3), and low multiple correlations (i.e., less than .6)
(Mitchell & Klimoski, 1986; Pedhazur, 1982). Unfortunately,
these conditions are prevalent in much of social science
research; therefore, shrinkage formulas may be less useful under
these conditions.

Invariance can also be evaluated by calculating invariance 
coefficients (see step seven). The shrinkage formulas described 
above yield results that have no set metric (e.g., an $R^2$
shrinkage from .9 to .7 is not equivalent to a shrinkage from .2
to 0). However, this comparison problem does not arise when
invariance coefficients are used, since they do have a set metric
ranging between -1 and +1. The closer the invariance
coefficients are to one, the greater the degree of confidence the
researcher has that the results are replicable (Rowell, 1991;
Thompson, 1989).

The advantages of the double cross-validation method are at 
least fourfold. First, this method does not waste data by 
crossing only one set of β-weights. By crossing both sets of
weights, a more rigorous approach to validation is created
(Mosier, 1951; Pedhazur, 1982). Second, readily accessible
statistical packages such as SPSS-X can be used easily to run the
analyses needed for this procedure. A third advantage of this
technique is that it saves time and money in that the researcher
does not have to conduct two separate studies to determine
invariance. Finally, in most cases, this method can be used for
moderate sample sizes (i.e., at least 30 subjects per predictor
variable).

There are at least four disadvantages of the double cross- 
validation technique. First, not unlike split-half reliability
coefficients, which fluctuate depending on how the data are split,
invariance coefficients change when different splits of the
original sample are used. Second, since the sample data under
examination are usually collected all at one time for 
convenience, any changes due to timing would not be evidenced. 
Third, as is always the case in research investigations, if the 
sample is not representative of the target population, inaccurate 
conclusions may be drawn by using this method. Finally, as 
previously mentioned, shrinkage formulas may not work well with 
small sample sizes, small ratios of sample size to predictor 
variables, and low multiple correlations. 

Bootstrap Procedures 

Bootstrap methods are named after the old saying about 
pulling yourself up by your own bootstraps, in this case by 
creating many samples from only one available sample (Crask &
Perreault, 1977). Thirty years ago it would have been
unthinkable to use the bootstrap logic. Although the actual
steps required to use bootstrapping are simple from the viewpoint
of practicality, the computer is a necessary partner in the
process. Several microcomputer programs now allow researchers to
use these methods easily (e.g., Lunneborg, 1987).

Following a description of the required steps, advantages
and disadvantages of bootstrapping will be discussed. The task
analysis of the bootstrap logic includes three definitive steps.

1. The original data set for each of n subjects is copied a very
large number of times (e.g., 100,000,000).

2. "Bootstrap" samples of size n are randomly selected from the
"mega" file, regression analyses are conducted, and the β-weights
for each sample are calculated.

3. The mean, standard deviation, and median of bootstrap trials
for the β-weight estimators are calculated, and various
confidence intervals are also computed. The original β-weight
estimators are compared with the bootstrap information generated
from resampling.
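
The three steps can be sketched in Python as follows. This is a hedged illustration, not Lunneborg's (1987) REGBOOT program: the function names are hypothetical, NumPy is assumed, drawing n rows with replacement stands in for sampling from the "mega" file, and fitting an intercept within each resample is an assumption of the sketch.

```python
import numpy as np

def zscores(a):
    """Standardize with the original sample's mean and SD."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def standardized_slopes(Z, zy):
    """Least-squares slopes on z scores (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones(len(zy)), Z])
    coef, *_ = np.linalg.lstsq(design, zy, rcond=None)
    return coef[1:]

def bootstrap_betas(X, y, n_boot=500, seed=0):
    """Steps 1 and 2: resample subjects and recompute the weights."""
    Z, zy = zscores(X), zscores(y)  # standardized once, on the original sample
    rng = np.random.default_rng(seed)
    n = len(zy)
    out = np.empty((n_boot, Z.shape[1]))
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)  # n draws with replacement
        out[b] = standardized_slopes(Z[rows], zy[rows])
    return out

def summarize(betas):
    """Step 3: mean, standard deviation, and median across bootstrap trials."""
    return {
        "mean": betas.mean(axis=0),
        "sd": betas.std(axis=0, ddof=1),
        "median": np.median(betas, axis=0),
    }
```

Comparing `summarize(bootstrap_betas(X, y))` with `standardized_slopes(zscores(X), zscores(y))` parallels the comparison of bootstrap results with the original estimators described in step 3.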

Three advantages of the bootstrap logic overlap with those 
of the double cross-validation method. Like the double cross- 
validation method, bootstrap procedures use all of the data and 
do not waste any, can be quickly implemented with easy to use 
microcomputer programs, and provide a savings of time and money. 
Moreover, a unique benefit of bootstrapping is that it does not
require the assumption that standard errors in the observed
values be randomly and normally distributed in order to work
effectively. Often this assumption is required before
statistical analysis can proceed, but as Thompson and Melancon
(1990) explained:

It seems illogical to make strong assumptions that standard
errors are randomly and normally distributed, when one has
data in hand that can be employed to empirically estimate
standard error. (p. 8)

Computer-intensive bootstrap methods can provide estimates for
the standard errors of results by using the actual data, rather
than relying on the assumption that sampling error is normally
distributed, which often is not the case.

Yet another advantage of bootstrapping lies in its power.
Since these methods consider many configurations of subjects in
their analyses, researchers can form hypotheses about result
generalizability across many different groupings of subjects.

The disadvantages of using bootstrap methods are few. As in 
the double cross-validation method, the influence of time factors
is not considered since one data set is used instead of two or
more from different research studies. Also, one must be cautious
in making generalizations from a single sample since, like all 
statistical procedures, bootstrapping will give misleading 
answers for a small percentage of the possible samples (Diaconis 
& Efron, 1983) . Finally, these methods require fairly large 
sample sizes to maximize their power. 

Both Methods Applied to a Heuristic Data Set

Result replicability of a readily available data set from 
Edwards (1985, p. 57) was assessed using both the double cross- 
validation and bootstrap methods. Observed values of three
independent variables ($X_1$, $X_2$, $X_3$) and one dependent variable
(DV) for a sample of 25 subjects were used in the multiple
regression analysis. In practice, the sample size of 25 would be
too small to apply either of the two methods with confidence;
however, for illustrative purposes, both will be employed.

The double cross-validation procedure was applied to the 
data using two separate runs of SPSS-X. The computer program is 
found in Table 1. Table 2 contains the complete raw data set
with the randomly assigned invariance subsample numbers (i.e., 1
or 2) for each subject so that the reader can re-create the
results from this example.



Insert Tables 1 and 2 about here 



After the data set was randomly split into two subsamples
(step 1), the SPSS-X program converted the raw scores to z scores
(step 2). The conversion was made by using the mean and standard
deviation of subsample 1 to standardize subsample 1 data and by
using the mean and standard deviation of subsample 2 to
standardize subsample 2 data. See Table 2 for a listing of the z
scores.

Regression analyses were then conducted (step 3) with each
subsample's z score data set, producing two regression equations:

$$\hat{Y}_{11} = (+.342957 \, Z_{11}) + (+.60406 \, Z_{12}) + (+.188967 \, Z_{13})$$

$$\hat{Y}_{22} = (+.339154 \, Z_{21}) + (+.815982 \, Z_{22}) + (-.254246 \, Z_{23})$$

Then $\hat{Y}_{11}$ values were calculated using z scores from subsample 1
and $\hat{Y}_{22}$ values were calculated using z scores from subsample 2
(step 4). Table 2 presents these results. The crossing of the
β-weight coefficients yielded two new regression equations
(step 5):

$$\hat{Y}_{12} = (+.339154 \, Z_{11}) + (+.815982 \, Z_{12}) + (-.254246 \, Z_{13})$$

$$\hat{Y}_{21} = (+.342957 \, Z_{21}) + (+.60406 \, Z_{22}) + (+.188967 \, Z_{23})$$

Finally, the invariance was evaluated by considering
shrinkage (step 6) and invariance coefficients (step 7). Given
that $R^2_{11}$ was .76249 and $R^2_{12}$ was .61121, $\mathrm{SHRINKAGE}_1$ equals .76249
minus .61121 or .15128. The shrinkage of the squared multiple
correlation coefficient for subsample 1 was about 15 percent. Since
$R^2_{22}$ was .74564 and $R^2_{21}$ was .57943, $\mathrm{SHRINKAGE}_2$ equals .74564
minus .57943 or .16621. The shrinkage of $R^2$ for subsample 2 was
about 16 percent. Since the shrinkage is not zero, these estimators
must be interpreted with some caution. However, since both
results are similar, they give support to stability across
samples.

Both invariance coefficients suggest that the original 
regression equation for the full sample is an accurate predictor 
of the dependent variable in this sample and that the equation is 
fairly stable across samples. The two invariance coefficients
were $r_{\hat{Y}_{11}\hat{Y}_{12}}$, which was .8953, and $r_{\hat{Y}_{21}\hat{Y}_{22}}$, which was .8815.
Since both coefficients are approaching one, stability across
samples is likely.

The bootstrap logic was applied to this data set by using a 
package of relatively "user-friendly" microcomputer programs 
(Lunneborg, 1987). The program gives prompts that ask for
specific information (e.g., "How many bootstrap samples do you
want?") in a step-by-step fashion. There is a publication that
accompanies the software (Lunneborg, 1987). One must enter the
data set either by hand on the keyboard or by supplying an MS-DOS
data file. In order to get β-weight coefficients for the
regression analyses, one must enter the data already converted
into z scores. Raw scores are standardized by using the sample
mean and sample standard deviation of the variable being
considered (e.g., use the mean of $X_1$ and the standard deviation
of $X_1$ to calculate $Z_{X_1}$).

The program REGBOOT generated a series of bootstrap samples,
and then calculated the β-weights for each sample. The β-weights
for the original sample were also calculated by the REGBOOT
program. The results were stored in an output file and used by
other programs to calculate various descriptive statistics. Five
hundred bootstrap samples were randomly selected from the "mega"
file created by copying the heuristic data set many times. Table
3 lists a sampling of the β-weights calculated from the 500
bootstrap samples.



Insert Table 3 about here 



Next, the BOOTLV program individually calculated the mean,
median, standard deviation (which is analogous to standard
error), skewness, and kurtosis of the β-weights for each of the
predictor variables. Table 4 contains these values. Finally,
the BOOTCI program computed 90% confidence intervals for each
β-weight estimator value. Selected results are presented in Table
4. BOOTCI can calculate intervals of any width (e.g., 95%, 90%,
75%) and can also provide intervals constructed using several
different methods (e.g., normal theory, percentile method, bias
corrected percentile, minimum width). Results from the BOOTCI
program for this data set are found in Table 5.
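
Two of the interval methods named above can be sketched in Python. The sketch rests on assumptions: the function names are hypothetical, the normal theory interval is centered on the original estimate, and BOOTCI's exact centering and quantile rules are not documented here, so results need not match its output digit for digit. The bias corrected and minimum width methods are not shown.

```python
import numpy as np
from statistics import NormalDist

def normal_theory_ci(estimate, boot_sd, level=0.90):
    """Symmetric interval: estimate plus or minus z times the bootstrap SD."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * boot_sd, estimate + z * boot_sd

def percentile_ci(boot_values, level=0.90):
    """Percentile method: cut equal tails off the bootstrap distribution."""
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(boot_values, [tail, 1.0 - tail])
    return float(lower), float(upper)
```

For the first predictor, `normal_theory_ci(0.30103, 0.1183305)` uses the estimate and standard deviation from Table 4 and gives an interval of roughly .106 to .496, which is close to, though not identical with, the symmetric interval reported in Table 5.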



Insert Tables 4 and 5 about here 



Results from the bootstrap programs indicate that the three
β-weights derived from the original 25-subject sample data set
are accurate predictors of the dependent variable in this sample
and that the equation is fairly stable across samples. The means
of the β-weights for each predictor variable from the 500
bootstrap samples were very comparable to the β-weights derived
from the original sample (see Table 4).

Conclusions 

Although result replicability is an essential part of the 
research triumvirate (i.e., statistical significance, result 
importance, result replicability), researchers often either
ignore result generalizability or evaluate it in inappropriate
ways. With the advent of computers, invariance techniques such
as the jackknife, double cross-validation, and bootstrap methods
can be quickly and easily applied to data sets to gauge
confidence in result replicability. Although each of these
procedures has some shortcomings, the advantages far outweigh
the disadvantages. When actual replication of research studies is
not feasible, researchers should always employ one of these
invariance procedures to determine result stability over
different samples.

Table 1

SPSS-X Commands for Double Cross-validation
Procedure Using Heuristic Data Set

TITLE 'Regression Invariance Procedure'
DATA LIST FILE=ABC/ X1 1-2 X2 4-5 X3 7-8 DV 10-11 INV 13
IF (INV EQ 1) Z11=(X1-11.846)/3.436
IF (INV EQ 1) Z12=(X2-33.231)/7.328
IF (INV EQ 1) Z13=(X3-14.231)/2.774
IF (INV EQ 2) Z21=(X1-10.417)/2.392
IF (INV EQ 2) Z22=(X2-21)/6.742
IF (INV EQ 2) Z23=(X3-13.5)/2.78
IF (INV EQ 1) YHAT11=(.342957*Z11)+(.60406*Z12)+(.188967*Z13)
IF (INV EQ 1) YHAT12=(.339154*Z11)+(.815982*Z12)+(-.254246*Z13)
IF (INV EQ 2) YHAT21=(.342957*Z21)+(.60406*Z22)+(.188967*Z23)
IF (INV EQ 2) YHAT22=(.339154*Z21)+(.815982*Z22)+(-.254246*Z23)
VARIABLE LABELS YHAT11 'SUBSAMPLE 1 DATA USING SUBSAMPLE 1 BETAS'
  YHAT12 'SUBSAMPLE 1 DATA USING SUBSAMPLE 2 BETAS'
  YHAT21 'SUBSAMPLE 2 DATA USING SUBSAMPLE 1 BETAS'
  YHAT22 'SUBSAMPLE 2 DATA USING SUBSAMPLE 2 BETAS'
PRINT FORMATS Z11 TO YHAT22 (F8.5)
LIST VARIABLES=X1 TO YHAT22/CASES=500/FORMAT=NUMBERED
SUBTITLE 'Regression Using All Data'
REGRESSION VARIABLES=X1 TO DV/DESCRIPTIVES=ALL/
  DEPENDENT=DV/ENTER X1 X2 X3
TEMPORARY
SELECT IF (INV EQ 1)
SUBTITLE 'REGRESSION FOR SUBSAMPLE #1'
REGRESSION VARIABLES=X1 TO DV/DESCRIPTIVES=ALL/
  DEPENDENT=DV/ENTER X1 X2 X3
TEMPORARY
SELECT IF (INV EQ 2)
SUBTITLE 'REGRESSION FOR SUBSAMPLE #2'
REGRESSION VARIABLES=X1 TO DV/DESCRIPTIVES=ALL/
  DEPENDENT=DV/ENTER X1 X2 X3
TEMPORARY
SELECT IF (INV EQ 1)
CORRELATIONS VARIABLES=DV YHAT21/STATISTICS=ALL
TEMPORARY
SELECT IF (INV EQ 2)
CORRELATIONS VARIABLES=DV YHAT21/STATISTICS=ALL
SUBTITLE 'CHECK Z CALCULATIONS'
CONDESCRIPTIVE Z11 TO YHAT22
SUBTITLE 'INVARIANCE RESULTS'
CORRELATIONS VARIABLES=DV YHAT11 TO YHAT22/STATISTICS=ALL

Note. This program was adapted from Thompson (1989). It
requires two runs. The first run uses the boldfaced commands.
The second run includes all the commands listed above.

Table 2

Heuristic Data Set Raw Data, Converted z Score Data, and
Estimated Y Scores Using Double Crossed Regression Equations

[Table body not recoverable. Surviving column headings: Z22, Z23, YHAT11, YHAT12, YHAT21, YHAT22]

Table 3

Calculated Beta Weight Coefficients for the Sample of 25 Subjects
and Seven of the 500 Random Resamplings of the 25 Subjects

Sample   Estimates of the β-weight Coefficients

   0     .30103E+00    .66605E+00     .23807E-01
   1     .40761E+00    .80936E+00    -.64835E-01
   2     .20332E+00    .59310E+00     .52362E-01
   3     .39495E+00    .55578E+00     .11935E+00
   4     .36818E+00    .55590E+00    -.10182E+00
   5     .21115E+00    .64848E+00     .28202E-02
   .          .             .              .
 499     .26095E+00    .66821E+00    -.68605E-02
 500     .18396E+00    .51799E+00     .81274E-01

Note. Sample 0 is the original sample of data for the 25
subjects. The results in the first row are the β-weights for the
three predictor variables presented in order: $Z_{X_1}$, $Z_{X_2}$, $Z_{X_3}$. The
rows that follow contain β-weights from the random bootstrap
samples.



Table 4

BOOTLV Bootstrap Results Across 500 Resamplings
of 25 Subjects in Random Configurations

Statistic                First Predictor   Second Predictor   Third Predictor

β-weights from
  Original 25            .30103            .66605             .023807

Mean of β-weights
  from 500 Samples       .2925403          .6610973           .01918864

Standard
  Deviation              .1183305          .1288992           .12074520

Median of 500
  Samples                .2932150          .661830            .0162600

Table 5

BOOTCI 90% Confidence Intervals from 500 Bootstrap Trials

                     First Predictor     Second Predictor    Third Predictor

Estimator            .30103              .66605              .023807

Confidence Interval
Method Used:

Symmetric
(Normal Theory)      .10534 to .49554    .45208 to .87712    -.17499 to .22316

Percentile
Method               .10730 to .49149    .43928 to .86267    -.17820 to .23212

Bias Corrected
Percentile           .12417 to .51435    .44150 to .86395    -.15397 to .26306

Minimum
Width                .10093 to .47708    .44831 to .86809    -.19189 to .24087